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Abstract 

Cassava is an important crop tliat provides food security and income generation in many tropical coun- 
tries and is I^nown for its adaptability to various environmental conditions. Despite its global importance, 
the development of cassava microarray tools has not been well established. Here, we describe the devel- 
opment of a 60-mer oligonucleotide Agilent microarray representing ^-^20 000 cassava genes and how it 
can be applied to expression profiling under drought stress using three cassava genotypes (MTAI16, 
MECU72 and MPER41 7-003). Our results identified about 1300 drought stress up-regulated genes in 
cassava and indicated that cassava has similar mechanisms for drought stress response and tolerance as 
other plant species. These results demonstrate that our microarray is a useful tool for analysing the 
cassava transcriptome and that it is applicable for various cassava genotypes. 
Key words: cassava; DNA microarray; expression profile; transcriptome 



1 . Introduction 

Cassava, Manihot esculenta Crantz, is a tropical crop 
that is important for food security and income gener- 
ation for many poor farmers in several Asian and 
African countries. More than 240 million tons of 
cassava are produced per year, and cassava serves as 
the primary food source for >7 50 million people. 
Cassava is one of the most efficient producers of 



carbohydrates and energy among all the food 
crops.^ Cassava is known for its adaptability to differ- 
ent soils and environmental conditions, particularly 
for its tolerance to drought. Cassava can withstand 
short and longest period of drought of around 4-6 
months.^ The drought stress response in cassava is fol- 
lowed by dehydration avoidance through deep root 
system, closure of stomata in dry air, and shedding 
of older leaves in which these features are effective 
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for survival under drought conditions. Upon recovery 
from water stress, cassava rapidly regenerates new 
leaves and leaf area index becomes higher compared 
with non-stressed plants.^ 

Partial sequencing of cDNA clones has been used as 
an effective method for gene discovery and in the last 
decade, the development of several EST collections 
has led to functional genomics studies in several 
plant species.'*'^ Large-scale cassava EST sequencing 
projects have been performed in various cassava re- 
search groups.^'^'^ Sakurai et al. constructed a full- 
length cDNA enriched library from cassava leaves 
and roots subjected to drought, heat, and acidic 
stress treatments, as well as from roots subjected to 
post-harvest physiological deterioration (PPD), a 
major obstacle for cassava commercialization.^ The 
cassava genome sequence is now publicly available 
and the initial assembly spans 419.5 Mb, covering 
54% of the estimated cassava genome size 
(770 Mb).^ At present, 30 666 protein-coding genes 
have been predicted from the genome sequence and 
3485 alternative splice forms have been supported 
by the ESTs.'' 

Microarray technology has demonstrated the power 
of high-throughput approaches to unravel key bio- 
logical processes and identify useful candidate genes 
and promoters for genetic engineering.' °~' ^ In 
cassava, a cDNA microarray containing ~5700 
unique cassava cDNA sequences has been prepared 
and used for studying expression profiles in response 
to Xanthomonas axonopodis pv. manihotis infection 
and during the PPD response. '^'''^ Oligo-DNA micro- 
arrays are gradually gaining importance due to the 
number of genes contained on each microarray, 
easier management of the system, and a greater 
dynamic range in the evaluation of expression 
levels.' ^''^ Recently, Yang et al^ prepared an 
Agilent 60-mer oligo microarray representing ~20 
000 genes and applied this microarray to study the 
expression profile of cassava genes during the tuberi- 
zation process. Although previous studies using 
cassava microarrays have provided valuable transcrip- 
tome information towards cassava molecular breed- 
ing, no reports on the cassava microarray studies 
under drought stress have been reported and only 
one genotype was used in the previous microarray 
studies. 

More than 6500 cassava germplasm accessions 
with varying phenotypes for biomass, abiotic stress 
tolerance, and resistance to harmful pathogens are 
available from the Genetic Resources Program of 
CIAT (http://isa.ciat.cgiar.org/urg/).' ^ Therefore, it is 
necessary to develop a cassava microarray that can 
be applied to various cassava genotypes that is essen- 
tial for identifying useful genes for genetic engineer- 
ing and advancing molecular breeding in cassava. 



In this study, we used three useful cassava geno- 
types: MTAI16 (also called KU50), MECU72, and 
MPER41 7-003. The detailed information for the 
three genotypes is shown in Section 2. Here we 
apply our newly developed cassava oligomicroarray 
to study the expression profile under drought stress 
and report that our cassava oligomicroarray can be 
used for analysing the cassava transcriptome in 
various cassava genotypes. 

2. Materials and methods 

2.1 . Plant materials and drought stress treatment 

The three cassava genotypes for the microarray ana- 
lysis were used: (i) MTAI16 (also known as KU50), 
which is one of the most important cassava genotype 
(A/1. esculenta Crantz), especially in Southeast Asia 
(including Thailand). It was developed through cross- 
breeding between Rayong 1 and Rayong 90 by breeders 
of the Kasetsart University, Department of Agriculture, 
Ministry of Agriculture and CIAT, and released officially 
in 1993 (http://www.tapiocathai.org/English/K2_e. 
html). It has high root yield, high starch content in 
root tubers, and vigorous plant growth with wide 
adaptability to unfavourable environmental condi- 
tions' (ii) MECU72, a naturally occurring cassava 
genotype (M. esculenta Crantz) with whitefly 
{Aleurotrachelus socialis) resistance and was isolated in 
Ecuador and (iii) MPER41 7-003, a wild landrace of M. 
esculenta subsp. peruviana, which shows resistance to 
whitefly and mealybug (Phenococcus herreni). MTAIl 6 
and MECU72 might be closely related, because they 
are classified as the same species. Previous polymorph- 
ism studies using AFLP and RAPD markers reported that 
the mean genetic similarity between M. esculenta and 
A/I. peruviana was 0.59, suggesting that M. esculenta 
and A/1. peruviana are closely related when compared 
with between M. esculenta and other wild 
species.^' '^^ Previously, the impact of water stress on 
yield and quality of cassava starch was studied in six var- 
ieties and this study indicated that MTAIl 6 recovered 
quickly from water stress and also had high starch 
content under recovery after drought stress compared 
with other varieties studied.'^ No studies on drought 
stress tolerance in MECU72 and MPER41 7-003 have 
been reported. 

The preparation of in vitro cassava plantlets was 
performed as follows: after plantlets were sterilized 
with 1% sodium hypochlorite solution, the plantlet 
was transferred to a glass pot with Murashige and 
Skoog (MS) media (pH5.8) containing 20g/l 
sucrose, 4.4 g/l MS salts containing vitamins 
(Duchefa), 2 |jlM CUSO4 (Wako), and 3.0 g/l gelrite 
(Wako). After cutting ~2-3 cm from the shoot top 
of the cassava plantlets, three shoot cuttings were 
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transferred to a glass pot with the MS media and 
grown under 1 6-h illumination of 40-80 |xmol 
photons m~^ s~^ at 30°C. The roots were formed 
from the sections of the cutting shoots and the plant- 
lets were grown until ~5cm high during 1 month 
and then used as experimental materials. The pheno- 
types of these genotypes under non-treated condi- 
tions were very similar (Fig. 1). For the untreated 
control samples, the shoots were harvested and 
frozen in liquid nitrogen. The plantlets were also sub- 
jected to drought treatment by transferring the plant- 
lets from the glass pot onto a plastic plate and 
maintaining them for 1 h under 40-80 |xmol 
photons m~^ s~^ at 30°C in 50% relative humidity. 
After the drought stress treatment, all of the leaves 
from the three cassava genotypes wilted (Fig. 1). 
After removing roots from the plantlets, the shoots 
were immediately frozen in liquid nitrogen and 
stored at -80°C until the RNA preparation. Three in- 
dependent biological replicative experiments were 
performed for each treatment and each genotype as 
follows: (i) MTAI16 (untreated control), (ii) MTAIl 6 
(1 h drought treatment), (iii) MECU72 (untreated 
control), (iv) MECU72 (1 h drought treatment), 

(v) MPER41 7-003 (untreated control), and 

(vi) MPER41 7-003 (1 h drought treatment). 

2.2. RNA extraction 

Total RNA was extracted from six shoots (~800 mg 
fresh weight (FW)) per experiment using an RNeasy 
Plant Mini Kit (QIAGEN) according to the manufac- 
turer's instructions (QIAGEN). The total RNA extracts 
were treated with RNase-free DNase I (QIAGEN) to 
completely remove genomic DNA. The total RNA 



No treatment Drought treatment 



MTAI16 





MECU72 




MPER 
417-003 





Figure 1. Phenotype of cassava genotypes after 1-h drought-stress 
treatment. 



quality was evaluated with electrophoresis using the 
Bioanalyzer system (Agilent). The extracted total 
RNA was stored at -80°C until further use. 



2.3. Oligomicroarray design 

Figure 2 shows an overview of the development of 
our cassava oligomicroarray. We retrieved 76 566 
ESTs and 86 cDNA (from high-throughput cDNA cat- 
egory) sequences from GenBank in February 2009 as 
starting sequences for the microarray probe design. 
The sequences used are derived from various cassava 
genotypes, such as MTAIl 6, CAS36.01, CAS36.04, 
CM21 772, CM523-7, MCol22, lAC 1 2.829, 
MBra685, MColl 522, MNga2, MPerl 83, Mirassol, 
SGI 07-35, Sauti, Gomani, Mbundumali, TME 1, 
Ml<ondeziTMS30572, and CM2177-2. Some 
sequences deposited in GenBank were contaminated 
by non-native sequences derived from cloning 
vectors, bacterial hosts, and other sources. In addition, 
abundant repetitive elements in a sequence set 
decreased our ability to accurately assemble the se- 
quence. Therefore, contamination and repetitive ele- 
ments in the retrieved sequences were masked via 
the cross match program in the Phred/Phrap 
package^^ by using the -minmatch 10 -minscore 
20 parameters and comparison with the NCBI 
UniVec and MSU Plant Repeat Database, respectively. 



GenBank se<|uence reposKory~ 

Retrieve cassava EST/cDNA data from GenBank 
76,652 sequences as starting information 



I 



Mask contamination, repetitive elements compared with 
NCBI UniVec and MSU Plant Repeat DB respectively, 
and polyA or polyT region by using the cross-match 
program 



Discard sequences shorter than 100 bp 
76,568 cleaned sequences 



Assemble via the CAPS program 



11,422 contigs and 18,214 singlets 




Evaluate sequence orientation via BLASTX against 
protein datasets and sequence direction information 
(forward or reverse read EST) 



target sequences for oligo microarray prot>e deslgiT 

Design 60-mer microarray probes via Agilent eArray 



21,522 unique selected 60-mer probes 

P Duplicate the selected probes as one experiment pack 

Agilent 44k x 4 pack platform cassava oligo microarray 
Figure 2. Design of the cassava oligo-DNA microarray. 
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Similarly, polyA and polyT sequences were also 
masked. We then omitted sequences that are 
shorter than 1 00 bp from the design process. In 
order to eliminate the sequence redundancy, the 
remaining sequences were assembled with the CAP3 
program^'* by using the -p 95 parameter. This condi- 
tion is tighter than one for redundancy omission 
alone. All the public ESTs which were used in our as- 
sembly were not deposited as the sense strand and 
the contigous sequences in our assembly included 
antisense sequences. Therefore, we evaluated the con- 
tiguous sequences using the public protein sequence 
sets. The sense strands of the assembled sequences 
were evaluated using the BLASTX program^^ against 
the three databases, NCBI RefSeq plant (released 
January 20, 2009),^*^ UniProt TrEMBL plant (released 
3 March 2009),^^ and the predicted protein 
sequences from poplar (JGI Poptrl_l) and castor 
bean genome sequences (GenBank accession 
numbers EQ973772-EQ999533). We identified the 
sequence that aligned with the three protein sequence 
sets with the same translation frame as the appropriate 
sequence for the microarray probe design. 

Finally, the sequences filtered through the sense 
direction evaluation were submitted to the eArray 
application (https://earray.chem.agilent.com/earray/) 
of Agilent Technologies (Santa Clara, CA, USA) to 
design the 60-mer oligomicroarray probes using 
'Best probe methodology' and 'Design with 3' bias' 
options in order to prevent the occurrence of cross- 
hybridization. Note that only 2.6 and 0.6% had >80 
and 90%, respectively, similarity with other probes 
on the array (Supplementary Table SI). About 60% 
of the probes had >l-bp overlap with the structural 
portion of genes. Sixty probes are added as non- 
plant or negative/positive control sequences on the 
array. The information is available at the Gene 
Expression Omnibus (GEO), accession number 
GPL14139. 



2.4. Microarray hybridization 

Cyanine-3 (Cy3)-labeled cRNA was prepared from 
0.5 |xg of total RNA using the Quick Amp Labeling 
kit (Agilent Technologies) according to the manufac- 
turer's instructions (http://www.chem.agilent.com/ 
Library/usermanuals/Public/G41 40-9 0040_Gene 
Expression_One-color_v6. 5.pdf), followed by RNeasy 
column purification (Oiagen). Dye incorporation and 
cRNA yield were checked with the NanoDrop ND- 
1000 Spectrophotometer. The Cy3-labelled cRNA 
(1 .65 |jLg) was fragmented at 60°C for 30 min accord- 
ing to the manufacturer's instructions. Agilent hybrid- 
ization buffer was then added to the fragmentation 
mixture and hybridized to the Agilent microarray 
(GPL14139) for 17 h at 65°C in a rotating Agilent 



hybridization oven. After hybridization, the microarrays 
were washed using the optimized protocol (http:// 
www.chem.agilent.com/Library/usermanuals/Public/ 
G41 40-90040_GeneExpression _One-color_v6.5.pdf) 
recommended by Agilent technologies. Slides were 
scanned immediately after washing on the Agilent 
DNA Microarray Scanner (G2 505B) using one color 
scan setting for 4 x 44 K array slides. Signal intensities 
were detected from the obtained digital images 
using Feature Extraction software (Ver. 9.1; Agilent 
Technologies). Three independent biological replica- 
tive experiments were performed for each treatment 
and for each genotype. 

2.5. Microarray data analysis 

Total 1 8 expression data obtained by microarray 
analysis were exported to GeneSpring GX (Agilent 
Technologies) and per chip normalization to the 
quantile expression level and per gene normalization 
to the median expression intensity were performed 
in all samples. Data were transformed into the log 2 
ratio for display and analysis. The following calcula- 
tions were performed according to the manufac- 
turer's instructions of GeneSpring GX. Briefly, the 
microarray data were filtered to remove control 
probe sets and those probe sets with an intensity 
value close to background levels. The remaining 
genes were filtered based on the deviation of the 
intensity values within a condition. The remaining 
genes were placed together in one list. The genes 
selected in this way were further filtered to remove 
those probe sets whose expression change under all 
experimental conditions was below a threshold, 
based on median normalized intensity values, which 
was considered to be the no-change threshold. The 
resulting working gene list of transcripts for each 
experiment was used for the statistical analysis. To 
test for differential expression, an analysis of variance 
test was carried out between non-treated and 
drought-treated samples. The changes in gene expres- 
sion were statistically analysed by the unpaired t-test 
(threshold was set at P< 0.01) for two groups (non- 
treated and drought-treated samples). The false 
discovery rate (q-value) was calculated for each P 
value according to the method of Benjamini and 
Hochberg (1 995).^^ The information from the cassava 
oligomicroarray is available at the GEO of NCBI. The ac- 
cession numbers are Platform, GPL14139; Series, 
GSE3 1 749; Samples, GSM787966-GSM787983. 

When the total 21 522 sequences on the array 
were searched by BLASTX program against 35 176 
Arabidopsis protein sequence data sets in the nuclear 
genome. (TAIRl 0, ftp://ftp.arabidopsis.org/home/tair/ 
Genes/TAIRl O_genome_release/TAIR1 0_blastsets/ 
TAIRl 0_pep_201 01 214),20436sequenceshad hits 
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with/Aral^/t/ops/s proteins at £ value < le~^. The top hits 
were used for gene annotations, and the corresponding 
AGI code was used for the functional classification using 
the gene ontology (GO) of TAIRl 0 (http://www. 
arabidopsis.org/tools/bulk/go/index.jsp). The percent- 
age of the up-regulated, down-regulated, overrepre- 
sented and underrepresented genes was calculated as 
follows: percentage (%) = (number of the genes classi- 
fied into the GO term)/(total number of the genes 
used for the classification). 

2.6. Quantitative real-time RT-PCR (qPCR) analysis 

First-strand cDNA synthesis was performed with a 
Superscript® VILO™ cDNA synthesis kit (Invitrogen) 
using random hexamer primers. After denaturing 
the total RNA (2.0 |jLg) at 70°C for 5 min, reverse tran- 
scription was performed for 1 h at 42°C with total 
RNA, 4 |jlI of Superscript VILO Reaction Mix, and 2 |xl 
of Superscript Enzyme Mix in a 2 0 |jlI total volume. 
The reaction was stopped by heating for 5 min at 
85°C. The first-strand cDNA preparations were 
stored at "30°C until use. The basic procedure of 
qPCR was carried out on an ABI PRISM 7000 
(Applied Biosystems) using a cDNA mixture corre- 
sponding to 1 .5 ng of total RNA, 1 0 |xl of Fast SYBR® 
green master mix (Applied Biosystems), 0.5 jjlM of 
both forward and reverse primers in a 20 jjlI total 
volume. The gene-specific primers used for qPCR are 
listed in Supplementary Table S2. The sequences of 
all primer sets were designed using the Primer 3 
program (http://primer3.sourceforge.net/). The PGR 
cycling conditions were as follows: 95°C for 20 s, for 
initial denaturation followed by 45 cycles of 95°C 
for 5 s and 60°C for 30 s. The specificity of the PGR 
amplification was evaluated with a melting curve 
analysis (from 55°G to 95°G) of the band pattern of 
the amplification product after the final cycle of the 
PGR. Each plate also incorporated a no-template 
control. We employed probes specific for the ubiqui- 
tin-conjugating enzyme 4 (UBG4) gene from cassava 
as references. The qPGR data was analysed with the 
AGT method using a reference gene. For each 
sample, the mRNA levels of target genes were normal- 
ized to that of the UBG4 mRNA. 

3. Results and discussion 

3.1 . Cassava oligomicroarray design 

For the development of the cassava oligomicroarray, 
we designed 60-mer oligonucleotide probes based on 
76 652 cassava cDNA sequences (Fig. 2; 76 566 ESTs 
and 86 high-throughput cDNA sequences retrieved 
from GenBank in February 2009). We omitted non- 
native sequences derived from cloning vectors, bacter- 
ial host sequences, and sequences shorter than 



1 00 bp. As a result of this cleaning process, the 
number of sequences was reduced to 76 568. In 
order to eliminate the sequence redundancy, the 
remaining sequences were assembled using the 
GAP3 program, which resulted in the identification 
of 29 636 non-redundant sequences (11422 
contigs and 18 214 singlets). We then performed a 
BLASTX search with the target sequences against 
NGBI RefSeq plant, UniProt TrEMBL plant, and the pre- 
dicted protein sequences from poplar and castor bean 
genome sequences to evaluate the transcriptional dir- 
ection, because some public ESTs were not deposited 
as sense strand sequences. This analysis identified 2 5 
708 potential target sequences (average size: 853 bp) 
for the 60-mer probe design. To obtain more individ- 
ual sequences among cassava varieties, we assembled 
the cassava expressed-sequences using the option 
p 9 5. We then applied the selected probes to the 
design via Agilent e-array (https://earray.chem. 
agilent.com/earray/) using 'Best probe methodology' 
and 'Design with 3' bias' options to prevent cross- 
hybridization (Fig. 2). Finally, 21 522 unique micro- 
array probes were selected via the Agilent's eArray 
application. About 52% (1 7 753) of the total pre- 
dicted cassava GDS (34 151) are supported by 21 
52 2 probes. Please note that when we designed the 
cassava microarray in 2009, 18 591 cassava GDS 
have been supported by the ESTs and this array has 
the probes corresponding to 89% (1 6 61 9) of them. 

When the total 21 52 2 sequences were searched by 
BLASTX program against 35 1 76 Arabidopsis protein 
sequences present in the nuclear genome (TAIRl 0, 
ftp://ftp.arabidopsis.org/home/tair/Genes/TAIRl 0_ 
genome_release/TAIRl 0_blastsets/TAIRl 0_pep_ 

2 01 01 214), 20 436 sequences had hits with 
Arabidopsis proteins. Among them, 1 6 875 sequences 
had hits with 9748 Arabidopsis proteins at £ value 
<le~^. The top hits were used for gene annotation 
and analyses of the GO (please see below). Note 
that the percentage of the probes on the array 
having >80 or 90% similarity with other probes is 
2.6 or 0.6, respectively (Supplementary Table SI). 
We then adopted the Agilent 44k oligomicroarray 
platform and the selected probes were duplicated in 
the randomized layout on the array. We have also 
added the 60 probes of non-plant or negative/posi- 
tive control sequences on the array. 

Similar type of 60-mer Agilent cassava oligomi- 
croarray has been also developed^ ^ and applied to 
study the expression profiles during storage root for- 
mation in cassava. The previous studies have applied 
it to only one cultivar, TMS60444 and have not 
demonstrated that the probes on the array could hy- 
bridize with the cRNAs prepared from various cassava 
genotypes. Our array differs from the previous array in 
the following ways: (i) slightly more ESTs were used to 
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design the array (76 566 vs. 71 520), (ii) more genes 
are represented on our arrays (21 522 vs. 20 840), 
and (iii) care was taken to examine the transcript dir- 
ection on our array. 

3.2. Cassava oligomicroarray can be used for 
transcriptome analyses in various cassava 
genotypes 

We applied the 22-k cassava oligomicroarray to 
study the expression profiles of three cassava geno- 
types, MTAI16, MECU72, and MPER41 7-003 under 
drought stress (Fig. 1) as described in Section 2. 
After filtering (see Section 2), the genes with the fol- 
lowing characteristics were selected: (i) the signal in- 
tensity is higher than the local background plus 2.6 
times of the standard deviation; (ii) the signal inten- 
sity is not saturated; and (iii) the spot is uniform. We 
adopted the Agilent standard protocol for the oligo- 
microarray kit in order to keep the objectivity of the 
statistics (http://www.genomics.agilent.com/files/ 
Manual/G4460-90026_FE_Reference.pdf). The total 
numbers of valid gene probes that were identified as 
'expressed' with the above characteristics were 1 6 
888, 1 7 772, and 1 7 037 from MTAI1 6, MECU72, 
and MPER41 7-003, respectively (Table 1). These 
results showed that 77-83% from a total of 21 522 



genes were detectable by hybridization to the cRNA 
from the tissues. 

To show the feasibility of our oligomicroarray for 
cassava genotypes with heterologous chromosomes, 
the signal intensities of the above-selected probes 
were plotted, and the correlation coefficients (R^) 
among three cassava genotypes under a non-treated 
condition were evaluated (Table 1). The values 
were 0.959, 0.967, and 0.879 between MTAI1 6 and 
MECU72, MTAI16 and MPER41 7-003, and MECU72 
and MPER41 7-003, respectively. It is worth noting 
that the value between MECU72 and MPER41 7- 
003 was slightly lower than that of other combina- 
tions. This might be due to the following reasons: (i) 
the MTAI1 6 ESTs have been used for the microarray 
design, but the ESTs from MECU72 and MPER41 7- 
003 have not been used and (ii) the chromosomes 
of MECU72 and MPER41 7-003 might be more heter- 
ologous compared with MTAI1 6. To examine the re- 
producibility of the experiments, we calculated the 
coefficient of variation (CV) value for the signal inten- 
sities in the three replicative microarray experiments. 
In this study, we used four types of classification 
based on the CV value, that is, <0.2 5, between 0.2 5 
and 0.50, between 0.50 and 0.75, and >0.75. More 
than 90% of all signals have CV values of <0.50 in 



Table 1 . Number of expressed genes and data reproducibility among experiments using tliree cassava genotypes 



Cassava genotypes 


Number of expressed genes'" 


Correlation coefficient 










MTAIl 6 


MECU72 


MPER41 7-003 


MTAIl 6 


1 6888 


1.000 






MECU72 


1 7772 


0.959 ± 0.014 


1.000 




MPER41 7-003 


1 7037 


0.967 ± 0.005 


0.879 + 0.014 


1.000 



^Total number of the gene probes with the following characteristics in six experiments (both non-treated and drought- 
treated samples): (i) the signal intensity was higher than the local background plus 2.6 times of the standard deviation; 
(ii) the signal intensity was not saturated, and (iii) the spot is uniform. 
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Figure 3. Variation in signal intensities in microarray experiments using three cassava genotypes. Black and white bars indicate non-treated 
and drought-treated samples, respectively. CVs were calculated for signal intensities in the three independent hybridizations. 
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Upregulated 
(>two-fold induction) 



B 



Down regulated 
(>two-fold repression) 



MECU72 
(total 305 
genes) 




MTAI16 
(total 1078 
genes) 



MECU72 
(total 419 
genes) 




MTAI16 
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Figure 4. Venn diagram analysis of the genes up-regulated (A) or down-regulated (B) by drought stress treatment in the three cassava 
genotypes. In each genotype, the drought stress-up-regulated genes were identified as follows: (i) ratio (drought treatment/no 
treatment) >2, (2) Mest, P-value < 0.01 , and (3) FDR (q-va\ue) <0.1 according to the method of Benjamini and Hochberg (1 995). 
The drought stress-down-regulated genes were identified as follows: (i) ratio (drought treatment/no treatment) <0.5, (ii) t-test, 
P-value < 0.01 , and (iii) FDR (q-va\ue) <0.1 according to the method of Benjamini and Hochberg (1 995). 



both the control and drought-treated conditions 
(Fig. 3). These results indicate that there are few sig- 
nificant differences in the signal intensities among 
the three genotypes and that our oligomicroarray 
can be applied to multiple cassava genotypes. 

Genes whose expression levels changed > 2-fold in 
response to drought treatment were selected using 
statistical methods (see Section 2). The following 
three criteria were used for the gene selection: (i) 
the signal intensity changed > 2-fold between the 
non-treated and drought-treated samples; (ii) The P 
value from the Mest for two groups (non-treated 
and drought-treated samples) was <0.01 and (iii) 
FDR ((7-value) according to the method of Benjamini 
and Hochberg was <0.1.^^ The number of drought 
stress up-regulated genes was 1078, 305, and 671 
in MTAI1 6, MECU72, and MPER41 7-003, respectively 
(Fig. 4) and the number of drought stress down-regu- 
lated genes was 597, 419, and 238 in MTAI1 6, 
MECU72, and MPER41 7-003, respectively (Fig. 4). 
These results suggest that MTAI1 6 has the specific 
system for adaptability under various stress condi- 
tions, such as drought stress. Previous studies reported 
that MTAI1 6 recovered quickly from stress and also 
had high starch content under drought recovery 
compared with other varieties studied.'^ Sequencing 
analysis of the full-length cDNA clones from MTAI1 6 
subjected to various stress treatments, such as 
drought, identified many putative gene duplications 
that might have played a role in cassava stress 
responses.^ A total of 168 genes (Fig. 4; 
Supplementary Table S3) and 69 genes (Fig. 4; 
Supplementary Table S4) were up-regulated and 
down-regulated, respectively, by drought stress in all 
three genotypes. 



3.3. Cassava has similar mechanisms for drought 
stress response and tolerance as other plants, 
such as arabidopsis 

Drought can be a major environmental constraint 
affecting the growth and physiology of many plant 
species. As a result, many plants have developed strat- 
egies to defend against damage caused by drought 
stress. In this study, we identified 168 genes that 
were up-regulated by drought stress in three cassava 
genotypes. Among them, there were many homologs 
of drought-inducible genes that were identified in 
previous studies of other plant species, such as 
Arabidopsis and rice (Supplementary Table S3).^°~^^ 

Functional category classification using the GO of 
the Arabidopsis Information Resource (TAIR1 0; http:// 
www.arabidopsis.org/tools/bulk/go/index.jsp) for 
three components ('biological process', 'molecular 
function' and 'cellular component') was performed 
on the 1 68 drought stress up-regulated or 69 down- 
regulated genes (Fig. 5). Many drought stress up-regu- 
lated genes were classified into the GO terms 'othercel- 
lular processes', 'other metabolic processes', 'unknown 
biological processes', 'response to abiotic or biotic 
stimulus' and 'response to stress' for biological process. 

The drought stress up-regulated genes classified in a 
GO term 'response to stress' (Supplementary Table 
S5) include the homolog of a key gene in the biosyn- 
thetic pathway for abscisic acid (ABA), AtNCED3 
(AT3G1 4440),^^ which encodes a member of the 
9-cis-epoxycarotenoid dioxygenases; an NAC tran- 
scription factor homolog, RD26 (AT4G2741 0)^'^ 
that is involved in the ABA-dependent drought stress 
signaling pathway; and a homolog of a drought-indu- 
cible galactinol synthase, AtGolSI (AT2G471 80),^^ 
which encodes the key step in the biosynthesis of 
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raffinose family oligosaccharides, osmolytes that play 
a role in drought stress tolerance. Among the 
drought-inducible genes identified, we also found 
homologs of several jasmonate ZIM-domain (JAZ) 
proteins^^ that function in jasmonate pathway 
responses, such as wounding. This is most likely the 
result of cross-talk between ABA and JA signaling 
that has been observed in previous studies.^^ In 
cassava, the interaction between ABA and JA might 
function in protecting the plants from the water loss 
that occurs under drought stress conditions. This 
might also be due to the fact that the cassava plants 
used for this microarray analysis were subjected to a 
wounding stress when the plantlets were removed 
from the gelrite medium during the drought stress 
treatment and when the shoots were cut from the 
plantlets prior to storage at -80°C. 

Many drought stress down-regulated genes were 
classified into the GO terms 'other intracellular 



components', 'other cytoplasmic components', 
'chloroplast', 'other membranes' and 'plastid' for 
cellular component (Fig. 5C). The drought stress 
down-regulated genes classified into the GO terms 
'chloroplast' or 'plastid' include the homologs of a 
light-harvesting chlorophyll a/b-b\nd\ng protein, 
LHCA4 (AT3G47470) and a chloroplast triose 
phosphate/3-phosphoglycerate translocator, APE2 
(AT5G46110) genes (Supplementary Table S4). 
These results are consistent with previous reports that 
drought stress inhibits photosynthesis.^^ Repression 
of photosynthesis under drought stress also occurs in 
cassava in the same way as other plants and might 
help the plants to survive under the stress. 

Hierarchical clustering analysis of the 1 68 drought 
stress up-regulated and 69 down-regulated genes 
revealed similar expression pattern of the genes in 
both non-treated and drought stress-treated condi- 
tions among the three genotypes (Fig. 6). The venn 
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Figure 5. Percentage of GO terms for (A) biological process, (B) molecular function, and (C) cellular component of the genes up-regulated 
(1 68 genes in Supplementary Table S3; black bar) and down-regulated (69 genes in Supplementary Table S4; white bar) by drought 
stress treatment in the 3 genotypes (MTAIl 6, MECU72, and MPER41 7-003). 
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Figure 6. Hierarchical clustering analysis of the genes up-regulated 
(1 68 genes) and down-regulated (69 genes) by drought stress 
treatment in the three cassava genotypes. The signal intensity 
values for each sample were transformed to log2 values and 
subjected to hierarchical clustering using standard correlation. 
The genes with higher and lower signal intensity values are 
shown in red and blue, respectively. The genes with the signal 
intensity value of a median level are shown in yellow. 



diagram analysis also identified many genes up- 
regulated or down-regulated specifically in each 
genotype by drought stress treatment (Fig. 4; 
Supplementary Tables S6-S1 7). These genes include 
genes with unknown function and ones with various 
functional categories, such as transcription factors, 
protein kinases, stress response-related ones and me- 
tabolism-related ones. The GO term analyses for bio- 
logical process showed that many genes up-regulated 
specifically in each genotype by drought stress were 
classified into the GO terms 'other cellular processes' 
and 'other metabolic processes'. We could not find 
big differences on the pattern of GO term among 
the differentially regulated gene groups in each geno- 
type (Supplementary Figs SI -S3). 

Many differentially expressed genes under the same 
conditions (no treatment and drought stress treat- 
ment) were identified between the genotypes 
(Supplementary Tables S18-S30) and the GO 
term analyses of the genes were performed 
(Supplementary Figs S4-S6). The number of the 



differentially expressed genes was larger in 
MPER41 7-003/MECU72 compared with that in 
MTAI16/MECU72 and MTAIl 6/MPER41 7-003. The 
differentially expressed genes between the genotypes 
include the genes with unknown function and the 
genes with various functional categories, such as tran- 
scription factors, protein kinases, transporters and 
metabolism-related ones. 



3.4. Validation of the microarray data by qPCR 

To evaluate the expression profiles obtained by 
microarray analysis, we also performed qPCR analysis. 
Nine genes were randomly selected from the genes 
shown in Supplementary Tables SI and S3. The 
cassava homolog of the ubiquitin UBC4 gene 
(Supplementary Table S2) was used as a control 
because its expression level was constant between 
no treatment and the 1 -h drought treatment. The 
results of the qPCR analysis were consistent with 
those of the microarray analysis (Fig. 7). The 
between the two experiments for the 2 7 total plots 
was 0.904. These results show that the cassava micro- 
array provides reliable data and can be used for tran- 
scriptome analyses in various cassava genotypes with 
heterologous chromosomes. 
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Figure 7. Confirmation of microarray data by qPCR analysis. A 
scatter plot between the log2-transformed ratio (drought 
treatment/no treatment) measured by qPCR analysis (X axis) 
and those (drought treatment/no treatment) obtained by the 
microarray analysis (Y axis). White circles, black circles and 
white squares indicate the data from MTAIl 6, MECU72 and 
MPER41 7-003, respectively. Correlation coefficient (R^) was 
0.904. 
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